Maximal Quasi-Bicliques with Balanced Noise Tolerance: Concepts and Co-clustering Applications

Authors

  • Jinyan Li
  • Kelvin Sim
  • Guimei Liu
  • Limsoon Wong
Abstract

The rigid all-versus-all adjacency that a maximal biclique requires between its two vertex sets is extremely vulnerable to missing data. Several types of quasi-bicliques have been proposed to tackle this problem, but their noise tolerance is usually unbalanced and can be very skewed. In this paper, we improve the noise tolerance of maximal quasi-bicliques by allowing every vertex to tolerate up to the same number, or the same percentage, of missing edges. This idea leads to a more natural interaction between the two vertex sets: a balanced most-versus-most adjacency. The generalization is also non-trivial, as many large maximal quasi-biclique subgraphs do not contain any maximal bicliques. This observation implies that direct expansion from maximal bicliques may not guarantee a complete enumeration of all maximal quasi-bicliques. We present important properties of maximal quasi-bicliques, such as a bounded closure property and a fixed point property, and use them to design efficient algorithms. Maximal quasi-bicliques are closely related to co-clustering problems such as the co-clustering of documents and words, of images and features, and of stocks and financial ratios. Here, we demonstrate the usefulness of our concepts with a new application, a bioinformatics example in which the prediction of true protein interactions is investigated.
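The balanced tolerance described in the abstract can be illustrated with a minimal sketch. It assumes a bipartite graph given as a set of (x, y) edge pairs and reads "the same number" as an absolute bound `eps` on missing edges per vertex, and "the same percentage" as a fraction `delta` of the opposite vertex set. The function names, parameter names, and data layout are hypothetical and are not taken from the paper; the sketch only checks the tolerance condition and does not test maximality or enumerate anything.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def missing_per_vertex(x_set: Set[Hashable], y_set: Set[Hashable],
                       edges: Iterable[Edge]):
    """Count, for each vertex, how many vertices on the other side it is NOT adjacent to."""
    edge_set = set(edges)
    miss_x = {x: sum((x, y) not in edge_set for y in y_set) for x in x_set}
    miss_y = {y: sum((x, y) not in edge_set for x in x_set) for y in y_set}
    return miss_x, miss_y


def is_quasi_biclique(x_set, y_set, edges, eps: int) -> bool:
    """Absolute version (assumed): every vertex misses at most eps edges."""
    miss_x, miss_y = missing_per_vertex(x_set, y_set, edges)
    return all(m <= eps for m in miss_x.values()) and \
        all(m <= eps for m in miss_y.values())


def is_quasi_biclique_pct(x_set, y_set, edges, delta: float) -> bool:
    """Percentage version (assumed): every vertex misses at most a
    fraction delta of the vertices on the other side."""
    miss_x, miss_y = missing_per_vertex(x_set, y_set, edges)
    return all(m <= delta * len(y_set) for m in miss_x.values()) and \
        all(m <= delta * len(x_set) for m in miss_y.values())


# Toy example: a complete 3x3 bipartite graph with one edge removed per vertex.
X = {"a", "b", "c"}
Y = {1, 2, 3}
E = {(x, y) for x in X for y in Y} - {("a", 1), ("b", 2), ("c", 3)}
```

In this toy graph every vertex is missing exactly one edge, so the pair (X, Y) fails the all-versus-all biclique test (`eps = 0`) but passes with `eps = 1`, which is the most-versus-most adjacency the abstract refers to.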


Similar resources

Mining maximal quasi-bicliques: Novel algorithm and applications in the stock market and protein networks

Several real-world applications require mining of bicliques, as they represent correlated pairs of data clusters. However, the mining quality is adversely affected by missing and noisy data. Moreover, some applications only require strong interactions between data members of the pairs, whereas bicliques are pairs that display complete interactions. We address these two limitations by proposing maxi...


Quasi-bicliques: Complexity and Binding Pairs

Protein-protein interactions (PPIs) are one of the most important mechanisms in cellular processes. To model protein interaction sites, recent studies have suggested first finding interacting protein group pairs in large PPI networks and then searching for conserved motifs within the protein groups to form interacting motif pairs. To account for the noise and incompleteness of biolog...


Algorithms for induced biclique optimization problems

We present polynomial-time algorithms for induced biclique optimization problems in the following families of graphs: polygon-circle graphs, 4-hole-free graphs, complements of interval-filament graphs and complements of subtree-filament graphs. The problems are to find maximum induced bicliques, maximum induced balanced bicliques and maximum induced edge bicliques. These problems have applications for bicliqu...


Extracting large quasi-bicliques using a skeleton-based heuristic

Chapter 1 Introduction: 1.1 Motivation; 1.2 Preliminaries, notation, terminology and definitions; 1.3 Quasi-biclique literature review...


PSO-inspired BIRCH and Improved Bipartite Graph for Automatic Web Service Composition

Web services are published by service providers over the internet as independent software components that fulfill customer requests. Because a single service cannot always satisfy a customer request, service composition methods have evolved in which a chain of services is composed together. However, the longer search time for the discovery of composed...




Publication date: 2008